Abstract

Background: Previous work has shown that the hypersaline-adapted archaeon, Halobacterium salinarum NRC-1, is 
highly resistant to oxidative stress caused by exposure to hydrogen peroxide, UV, and gamma radiation. Dynamic 
alteration of the gene regulatory network (GRN) has been implicated in such resistance. However, the molecular 
functions of transcription regulatory proteins involved in this response remain unknown. 

Results: Here we have reanalyzed several existing GRN and systems biology datasets for H. salinarum to identify 
and characterize a novel winged helix-turn-helix transcription factor, VNG0258H, as a regulator required for reactive 
oxygen species resistance in this organism. This protein appears to be unique to the haloarchaea at the primary 
sequence level. High throughput quantitative growth assays in a deletion mutant strain implicate VNG0258H in 
extreme oxidative stress resistance. According to time course gene expression analyses, this transcription factor is 
required for the appropriate dynamic response of nearly 300 genes to reactive oxygen species damage from 
paraquat and hydrogen peroxide. These genes are predicted to function in repair of oxidative damage to proteins 
and DNA. In vivo DNA binding assays demonstrate that VNG0258H binds DNA to mediate gene regulation. 

Conclusions: Together these results suggest that VNG0258H is a novel archaeal transcription factor that regulates 
gene expression to enable adaptation to the extremely oxidative, hypersaline niche of H. salinarum. We have
therefore renamed VNG0258H as RosR, for reactive oxygen species regulator. 
Background

Halobacterium salinarum, an extremely halophilic eur- 
yarchaeon that resides in salt lakes and marine salterns, 
requires nearly saturated salt for growth and survival 
(100-150 g/L) [1]. In these environments, UV damage 
from intense sunlight and desiccation-rehydration cycles 
generate high levels of reactive oxygen species (ROS) 
and damage DNA and proteins [2]. H. salinarum is
highly resistant to ROS damage, withstanding many 
times what E. coli and other radiation-sensitive organ- 
isms can survive [3]. Like other ROS-resistant microbes 
such as Deinococcus radiodurans, H. salinarum uses a
battery of enzymatic and non-enzymatic strategies to 
withstand macromolecular damage. These include func- 
tional redundancy of DNA repair and antioxidant 
enzyme-coding genes [4-6]; a high cytosolic Mn(II) to Fe 
(III) ratio [7-9]; genomic polyploidy to provide templates 
for DNA double strand break repair [10]; and differential 
regulation of genes encoding macromolecular repair 
functions in response to oxidative stress [11]. 

Particularly striking is the effect of ROS on the gene 
regulatory network (GRN) of H. salinarum. Computa- 
tional inference methods on global gene expression data 
suggest that more than 80 predicted DNA binding pro- 
teins work together to bring about a concerted, dynamic 
gene expression response to neutralize ROS toxicity and 
repair macromolecular damage [11]. However, the
molecular functions of these putative ROS-responsive 
regulators remain unclear in this organism and other 
archaeal species. 

Transcription mechanisms in archaea are a chimera of 
eukaryotic and bacterial components. General transcrip- 
tion factors in archaea (e.g. TATA-binding protein and 
TFIIB homologs) more closely resemble those of eukar- 
yotes, whereas archaeal activators and repressors resem- 
ble those of bacteria [12]. Bacterial-type transcription 
factors (TFs) of the helix-turn-helix class of DNA binding
proteins are particularly overrepresented in available
sequenced archaeal genomes [13-15]. Compared with 
the substantial information on TF function in other 
domains of life, relatively few of the ~4,000 predicted
archaeal TFs [14] have been assigned a known function 
in vivo despite intense interest in recent years [16-25]. 

Here we identify and characterize the function of 
VNG0258H, a putative TF comprised of a winged helix- 
turn-helix (wHTH) domain and an uncharacterized do- 
main unique to a subset of haloarchaeal species. We 
used existing systems biology datasets to generate the 
hypothesis that VNG0258H may function in the re- 
sponse to ROS and/or oxygen perturbations. To test 
this, we generated a VNG0258H deletion mutant and 
monitored global gene expression dynamics and high 
throughput growth physiology in this strain in response 
to hydrogen peroxide (H2O2) and paraquat (PQ). Results 
suggest that VNG0258H is required for ROS resistance 
and modulates the expression of genes encoding pro- 
teins involved in repairing cellular damage from ex- 
tremely high levels of reactive oxygen species (ROS). In 
vivo binding assays demonstrate that VNG0258H binds 
directly to the promoter of sod2, encoding the [Mn] 
superoxide dismutase. We conclude that VNG0258H is 
a unique haloarchaeal TF required for the response to 
extreme oxidative stress endemic to hypersaline environ- 
ments. We have therefore renamed VNG0258H as RosR, 
reactive oxygen species regulator. 

Methods 

Strains and growth conditions 

All strains used in this study are listed in Additional 
file 1: Table S4. Briefly, Halobacterium salinarum NRC-1 
(ATCC700922) was used to determine the in vivo function 
of VNG0258H. A strain harboring an in-frame deletion of
VNG0258H was constructed in the Δura3 uracil auxotroph
parent strain as described previously [26]. H. salinarum
strains harboring VNG0258H fused to the c-myc epitope at
its C-terminus and driven by the VNG2293G strong
constitutive promoter on a low-copy number plasmid were
constructed as described previously [27]. For culturing the
strains carrying the VNG0258H::c-myc and trmB::c-myc
fusions (used for ChIP-qPCR and growth assays), cultures
were supplemented with 20 µg/mL mevinolin for plasmid
maintenance. For routine culturing, H. salinarum Δura3
parent and Δura3ΔVNG0258H deletion mutant strains were
grown in complete medium (CM; 250 g/L NaCl, 20 g/L
MgSO4·7H2O, 3 g/L sodium citrate, 2 g/L KCl, 10 g/L
peptone) supplemented with uracil (50 mM) to complement
the Δura3 auxotrophy.

High throughput growth curves 

Starter cultures of H. salinarum NRC-1, Δura3 parent
strain, ΔVNG0258H, or ΔVNG0258H cells complemented
with VNG0258H::c-myc on a plasmid (Additional file 1:
Table S4) were grown to OD600 ~1.0 in 50 mL CM
supplemented with 50 mM uracil. Culture aliquots (200 µL)
were grown at 37°C for 48 hours under continuous shaking
(~225 rpm) in a Bioscreen C microbial growth
analyzer (Growth Curves USA, Piscataway, NJ) set to
measure optical density at 600 nm automatically every 30
minutes for 200 culture samples simultaneously. For
continuous H2O2 and paraquat (PQ) exposure experiments,
cultures were diluted in CM-uracil to OD600 ~0.05 and
supplemented with 30% (v/v) H2O2 to final concentrations
of 5, 6, 7, 12.5, 18.75, or 25 mM H2O2. 100 mM PQ was
added to final concentrations of 0.083, 0.167, or
0.333 mM. These ROS conditions have been used previously
and are also used here as proxies for the continuous
high-level UV exposure that H. salinarum experiences on
a routine basis in its salt lake habitat [11,28]. For shock
experiments, oxidant was added to growing cultures in
logarithmic phase at OD600 0.250 to 0.375 (as measured in
a standard 1 × 1 cm path-length cuvette spectrophotometer).
At least 4 biological replicate trials were conducted
for each strain under each condition. Growth rate was
calculated independently for each growth curve by taking the
slope of the linear regression fit to log2-transformed
curves from 12 to 24 hours for continuous exposure
experiments, and from 20 to 32 hours for shock experiments.
Individual growth rates were then averaged by
strain and growth condition. Averages, standard deviations,
and results of non-parametric paired t-tests (comparing H.
salinarum Δura3 to ΔVNG0258H strain growth under each
condition) are reported in the Figures. See Additional file 2
for supplementary methods regarding growth assays. See
Additional file 3: Table S1 for all raw and analyzed growth
data.
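The growth-rate calculation described above (slope of a linear regression fit to log2-transformed optical density within a fixed time window) can be illustrated with a minimal sketch. The function name `growth_rate`, the use of NumPy, and the synthetic curve are assumptions made for illustration only; they are not taken from the analysis scripts used in this study.

```python
import numpy as np

def growth_rate(times_h, od600, start_h, end_h):
    """Slope of a least-squares line fit to log2(OD600) within [start_h, end_h].

    The slope is expressed in doublings per hour.
    """
    t = np.asarray(times_h, dtype=float)
    od = np.asarray(od600, dtype=float)
    window = (t >= start_h) & (t <= end_h)          # keep only the fitting window
    slope, _intercept = np.polyfit(t[window], np.log2(od[window]), 1)
    return slope

# Synthetic (hypothetical) curve: one reading every 30 min for 48 h,
# starting at OD600 0.05 and doubling every 8 h.
times = np.arange(0, 48.5, 0.5)
od = 0.05 * 2 ** (times / 8.0)

# Continuous-exposure window from the text: 12 to 24 hours.
rate = growth_rate(times, od, 12, 24)
```

For shock experiments the same function would be called with the 20 to 32 hour window, and the per-curve rates would then be averaged by strain and condition.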

Gene expression microarray sample preparation, 
hybridization, and data analysis 

H. salinarum Δura3 parent and ΔVNG0258H mutant
strains were grown in CM supplemented with uracil to
mid-logarithmic phase (OD600 ~0.5). For H2O2 time
courses, 4-mL culture aliquots were removed for RNA
extraction at three time points prior to the addition of
25 mM H2O2 (-40 min, -20 min, 0 min) and five time
points following H2O2 addition (10, 20, 40, 60, 80 minutes).
Paraquat (PQ) time courses were prepared similarly with
the exception that additional time points were taken at 2 h,
8 h, and 24 h after the addition of 0.333 mM PQ to assess
long-term expression patterns. For each biological duplicate
time course, all samples were removed from the same
culture to ensure coherence of gene expression between
unstressed and stressed cultures. From each sample, cells
were immediately pelleted by centrifugation (12,000 × g,
30 sec, 25°C) and snap-frozen in liquid nitrogen. Sample
pellets were stored overnight at -80°C, followed by RNA
preparation using the Absolutely-RNA kit (Stratagene, La
Jolla, CA) according to the manufacturer's instructions.
RNA quality was assessed using the Bioanalyzer 2100
(Agilent Technologies, Santa Clara, CA). Freedom from
DNA contamination was ensured by PCR amplification of
200 ng of each RNA sample. 600 ng of each quality-checked
RNA sample was directly labeled with Cy3 and Cy5
dyes (Kreatech) as described previously [29,30] and
combined in equimolar amounts with oppositely labeled H.
salinarum NRC-1 reference RNA (from batch cultures
grown in CM at 37°C to mid-logarithmic phase). This
common reference RNA was used across all ~950
microarray experiments listed in the H. salinarum NRC-1
microarray data repository [31]. Samples were hybridized
to a custom 60-mer oligonucleotide microarray (Agilent
Technologies, Santa Clara, CA, 8 x 15,000 feature array,
AMADID ID #30108, GEO platform accession GPL14876).
This array contains 2,410 non-redundant open reading
frames (ORFs) of the H. salinarum NRC-1 genome. Probes
for each ORF were spotted on each array six-fold and
dye-swapping was conducted (to rule out bias in dye
incorporation) for all samples, yielding 12 technical replicates per
gene per time point. Slide hybridization and washing
protocols were performed according to the manufacturer's
instructions, except that hybridization was conducted in
the presence of 37.5% formamide at 68°C to ensure proper
stringency due to the high G + C content of the H.
salinarum genome (67%, [32]).

Slide scanning and spotfinding were conducted using
Feature Extraction software (Agilent). Within the R
Bioconductor [33] marray and limma packages [34],
resultant raw data were background-subtracted using
normexp [35], Loess normalized within each array, and
quantile normalized between all arrays. Any of the 12
gene-specific probes for each gene lying outside the 99%
confidence interval were removed using Dixon's test
[36]. Finally, remaining probe intensities for each gene
were averaged and log2 ratios were calculated, yielding
one expression ratio per gene. Resultant processed data
are listed in Additional file 4: Table S2 and Additional 
file 5: Table S3. Both raw and processed microarray data 
are also available through the NCBI Gene Expression 
Omnibus (GEO) accession number GSE33980. 
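To make the between-array step concrete, the following is a minimal sketch of quantile normalization on a small probes-by-arrays matrix. It is written in Python with NumPy purely for illustration; the analysis in this study was carried out with the R Bioconductor packages named above, whose handling of ties and missing values may differ. The function name and the toy matrix are hypothetical.

```python
import numpy as np

def quantile_normalize(intensities):
    """Quantile-normalize the columns (arrays) of a probes x arrays matrix.

    Each value is replaced by the mean, across arrays, of the values that
    share its within-array rank. Ties are broken by position (an assumption
    of this sketch, not necessarily what limma does).
    """
    x = np.asarray(intensities, dtype=float)
    order = np.argsort(x, axis=0, kind="stable")     # per-array sort order
    ranks = np.argsort(order, axis=0, kind="stable") # rank of each probe in its array
    reference = np.sort(x, axis=0).mean(axis=1)      # mean intensity at each rank
    return reference[ranks]

# Toy matrix: 4 probes (rows) measured on 3 arrays (columns).
raw = np.array([[5.0, 4.0, 3.0],
                [2.0, 1.0, 4.0],
                [3.0, 6.0, 6.0],
                [4.0, 2.0, 8.0]])
normalized = quantile_normalize(raw)
```

After normalization every array has the same empirical intensity distribution, while the rank order of probes within each array is preserved.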



In vivo DNA binding assays with ChIP-qPCR

H. salinarum harboring VNG0258H::myc was grown to
mid-logarithmic phase (OD600 ~0.5) in CM supplemented
with mevinolin. Transcription factor-chromatin
complexes were then cross-linked in vivo with 1%
formaldehyde for 20 min at room temperature and
subjected to immunoprecipitation (IP) by virtue of the myc
epitope tag as described previously [27]. Primers
(Integrated DNA Technologies, Coralville, IA) were designed
according to criteria described in [37] and are listed in
Additional file 1: Table S4. ChIP samples from trmB::c-myc
cells were run simultaneously as controls, since TrmB
is a transcription factor previously shown not to bind the
region of interest [20]. Quantitative PCR (qPCR) reaction
and thermocycling conditions were as described in [27].
Each of the five biological replicate samples of RosR ChIP
were run in triplicate qPCR reactions for a total of 15 data
points per sample. Reactions with CT values greater than
0.5 standard deviations from the triplicate mean were
excluded from analysis. Enrichment of RosR binding at
each promoter locus was calculated in each ChIP sample
compared to the input sample using relative quantitation
as described [27]. Resultant data reported represent the
mean of all trials ± SEM.
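As a rough illustration of how enrichment relative to input and a mean ± SEM summary can be computed from CT values, consider the sketch below. The exact relative quantitation procedure is the one described in [27] and is not reproduced here; the formula 2^(CT_input - CT_IP) is a commonly used convention that assumes approximately 100% amplification efficiency and equal template dilution, and is an assumption of this example. All CT values shown are hypothetical.

```python
import math
import statistics

def fold_enrichment(ct_input, ct_ip):
    """Fold enrichment of IP over input, assuming ~100% PCR efficiency."""
    return 2.0 ** (ct_input - ct_ip)

def mean_and_sem(values):
    """Arithmetic mean and standard error of the mean."""
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem

# Hypothetical per-replicate CT values for one promoter locus
# (five biological replicates, one summarized CT each).
input_ct = [24.0, 24.0, 24.0, 24.0, 24.0]
ip_ct = [22.0, 21.0, 22.0, 23.0, 22.0]

enrichments = [fold_enrichment(i, p) for i, p in zip(input_ct, ip_ct)]
mean_enrichment, sem_enrichment = mean_and_sem(enrichments)
```

In practice the same calculation would be repeated for the control (trmB::c-myc) samples so that enrichment at a locus can be compared against a factor not expected to bind there.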

Systems biology data analysis, integration, and 
visualization 

All systems biology datasets were analyzed and visua- 
lized in the context of the web executable, interoperable 
Gaggle data analysis environment [38] and other existing 
online database tools. Specifically, predictions and 
hypotheses were made using the existing GRN for H. 
salinarum [11] and explored in Cytoscape [39]. Amino 
acid sequences of VNG0258H homologs from other 
halophilic archaea were compared using PSI-BLAST 
[40] in the context of the Halolex database [41] and 
NCBI GenBank. Sequences were aligned using ClustalW
[42]. Transcriptome structure data for the VNG0258H ge- 
nomic locus [43] (see Results) were visualized using the H. 
salinarum genome database [44] and the Gaggle Genome 
Browser [45]. 

Sharma et al. BMC Genomics 2012, 13:351
http://www.biomedcentral.com/1471-2164/13/351

The TM4 MultiExperiment Viewer (MeV) application
[46] within the Gaggle environment was used for
statistical analysis of microarray gene expression datasets.
Specifically, Significance Analysis of Microarrays (SAM,
a t-test-based method) was used to detect gene groups
with significantly different expression levels in the Δura3
parent and ΔVNG0258H mutant strains. Genes
significantly up- or down-regulated in ΔVNG0258H were
considered to be VNG0258H-dependent. Genes with
significantly different expression in PQ or H2O2 vs.
standard conditions in the Δura3 parent strain but not
ΔVNG0258H were considered to be PQ or H2O2-responsive
but VNG0258H-independent. The latter group
was further subjected to KMEANS analysis to detect
genes with altered dynamics in ΔVNG0258H cells.
Annotations for genes within resultant clusters were
analyzed using the Firegoose web portal within the
Gaggle [47]. Annotated genes were subsequently grouped by
arCOG annotations [48] using the R Bioconductor
package within the Gaggle environment. Significance of
enrichment within arCOG categories was calculated using
term-for-term analysis as described [49].
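Term-for-term enrichment analysis is generally understood to test each annotation category independently with a one-sided hypergeometric (Fisher's exact) test. The sketch below shows that calculation for a single category. It is an illustration under that assumption, not the implementation used in [49]; the function name and the gene counts in the small example are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(population, annotated, sample, hits):
    """One-sided hypergeometric p-value, P(X >= hits).

    population: total number of genes considered (e.g. all ORFs on the array)
    annotated:  genes in the population assigned to the category
    sample:     genes in the gene set of interest (e.g. one expression cluster)
    hits:       genes in the set of interest that are assigned to the category
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total

# Small hypothetical example: 10 genes, 4 in the category,
# a cluster of 3 genes of which 2 are in the category.
p_small = hypergeom_enrichment_p(10, 4, 3, 2)
```

Because one test is run per category, the resulting p-values would normally be corrected for multiple testing before categories are reported as enriched.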

Cis-regulatory sequence predictions were conducted
using the MEME online software package [50] with two dif- 
ferent sequence inputs: (a) open reading frames and 500 bp 
upstream sequence of the 50 genes differentially expressed 
in both H2O2 and PQ datasets; (b) promoter sequences of 
the sod2 gene from the 8 haloarchaeal genomes containing 
RosR homologs. Searches on each type of sequence input 
were constrained to 6-20 bp motifs. Palindromic output 
was not enforced. MEME was run in discriminative mode 
using the first 250 kbp of the H. salinarum genome as 
negative sequence. Output sequence position weight matri- 
ces were visualized in sequence logo format using the 
WebLogo package (weblogo.berkeley.edu). 

Results 

Using existing systems biology datasets to identify 
candidate regulators of reactive oxygen species (ROS) 
stress response 

To identify transcription regulatory proteins involved in 
the response to ROS and/or oxygen-related physiology 
in H. salinarum, we mined the existing systems biology 
datasets for this organism to identify candidate 
transcription factors. These data types include (1) the 
computationally inferred GRN; (2) changes in mRNA 
abundance microarray data during perturbations in oxy- 
gen and ROS conditions [11,30]; (3) genome-wide tran- 
scriptome structure [43]; and (4) proteomics data [51]. 

The existing GRN models have implicated approximately
80 TFs in the response to ROS [11,31]. However,
some of these TFs exhibit similar changes in mRNA 
abundance in response to ROS, and so the inference 
procedure frequently groups several TFs into a single 
regulatory node [31]. Thus, the computational inference 
procedure cannot discern which TF within a group regu- 
lates which target genes, nor can it distinguish direct 
from indirect regulatory influences. We reasoned that 
slight differences in the expression profiles of TFs within 
the same node may not have been detected by the infer- 
ence procedure, but may become evident upon closer in- 
spection. We therefore re-examined the gene expression 
profiles of each of the 80 TFs in the GRN under oxida- 
tive conditions (i.e. in the presence of high oxygen, para- 
quat, or hydrogen peroxide). In response to changes in 
oxygen levels, the expression pattern of one putative TF, 
VNG0258H, ranks first of all the TFs in the genome in
(1) correlation with genes associated with aerobic physiology,
including TCA cycle and electron transport (Cp =
0.327; Figure 1A) [30]; (2) anti-correlation with genes 
associated with anaerobic physiology, including DMSO 
reduction and phototrophy (Cp = -0.621; Figure 1A) 
[30]; and (3) magnitude of change (3.5-fold down- 
regulated with low oxygen and 2-fold up-regulated in 
high oxygen). In response to hydrogen peroxide (H2O2) 
exposure [9,11], VNG0258H expression is anti-correlated 
with clusters of genes associated with cobalamin biosynthesis,
iron homeostasis, and redox reactions (Figure 1B).
Gene expression correlations are more complex in response 
to the redox cycling drug paraquat (PQ), with some clusters 
correlated and others anti-correlated with the VNG0258H 
expression profile (Figure 1C). 
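The ranking of TFs by how closely their expression profile tracks a group of genes can be sketched as follows. This example assumes the correlation statistic is a Pearson coefficient computed between each TF profile and the mean profile of the gene group; the text does not define Cp further, so this is an assumption. TF names and expression values are hypothetical.

```python
import numpy as np

def rank_by_correlation(tf_profiles, target_profile):
    """Rank TFs by Pearson correlation with a target expression profile.

    tf_profiles:    dict mapping TF name -> sequence of expression ratios
    target_profile: mean expression profile of the gene group of interest
    Returns a list of (name, r) pairs, highest correlation first.
    """
    scores = {
        name: float(np.corrcoef(profile, target_profile)[0, 1])
        for name, profile in tf_profiles.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical profiles over four time points.
aerobic_mean = [0.0, 1.0, 2.0, 3.0]
profiles = {
    "tf_a": [0.1, 0.9, 2.2, 2.8],   # tracks the aerobic genes closely
    "tf_b": [3.0, 2.0, 1.0, 0.0],   # perfectly anti-correlated
    "tf_c": [1.0, 0.0, 1.0, 0.0],   # weakly related
}
ranking = rank_by_correlation(profiles, aerobic_mean)
```

The first entry of `ranking` is the TF most correlated with the gene group and the last entry is the most anti-correlated, which mirrors the two criteria used above to single out VNG0258H.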

Genome-wide whole transcript mapping data indicate
that the VNG0258H gene is transcribed as a monocistronic
message flanked by genes of unknown function [43].
The VNG0258H protein product has been detected by
mass spectrometric proteomics in the presence of oxygen
[30] and during recovery from high levels of gamma
radiation [2,51], confirming that the annotated ORF
encodes a bona fide protein that is expressed under
similar conditions as the VNG0258H transcript. Based
on this new perspective on existing systems biology
datasets for H. salinarum, we hypothesize that VNG0258H
encodes a putative TF that may play a role in the response
to oxygen and/or oxidative stress conditions.

Sequence homology suggests that VNG0258H may 
represent a class of DNA binding proteins unique to 
haloarchaea 

Primary amino acid sequence analysis suggests that
VNG0258H contains a central domain that bears weak
homology (39% identity, E-value < 0.007)
to the GntR winged helix-turn-helix
(wHTH) superfamily of bacterial transcription factors, 
which includes the MarR and PadR families (PFAM 
03551, Figure 2). Compared to characterized bacterial 
MarR family members, one residue in the VNG0258H 
HTH region known to be important for binding the 
major groove of DNA is conserved, as are two residues 
in the wing region that bind the minor groove (Figure 2, 
[52]). PSI-BLAST searches with the whole VNG0258H 
protein sequence and the short N- and C-terminal 
domains flanking the central wHTH domain of 
VNG0258H matched only those from a small clade of 
halophilic archaea (Figure 2). Together these sequence 
data are consistent with the hypothesis that VNG0258H 
may represent a class of transcription factors unique to 
the haloarchaea. To our knowledge, none of these puta- 
tive archaeal DNA binding proteins has been function- 
ally characterized. 
Figure 1 VNG0258H gene expression in response to H2O2 and oxygen. (A) Comparison of VNG0258H gene expression to that of genes
involved in aerobic and anaerobic physiology [30]. The x-axis represents shifts in oxygen levels over time in a fermentor. Graph background
shading corresponds to the relative oxygen concentration. "High" oxygen represents 100% oxygen saturation in CM medium (5 µM) as measured
by a dissolved oxygen probe. "Low" represents 5% saturation or below [30]. The y-axis represents mean and variance normalized log10 expression
ratios compared to mid-logarithmic phase H. salinarum. The green trace represents VNG0258H gene expression, whereas black and red traces
represent the mean expression profiles for genes encoding proteins involved in aerobic and anaerobic physiology, respectively [30]. (B) Mean
gene expression profiles for clusters of genes correlated with VNG0258H mRNA changes in response to H2O2 [11]. See legend for colors. (C) Mean
gene expression profiles in response to PQ for genes from (A). Colors are as in (B).



ΔVNG0258H growth is impaired in the presence of H2O2
and paraquat 

To test the hypothesized role of VNG0258H in the response
to oxygen and/or oxidative stress conditions,
we generated an H. salinarum strain deleted
for VNG0258H (Methods). We measured its response
to varying H2O2 and paraquat (PQ) concentrations in
different phases of growth. Under standard aerobic
growth conditions, the H. salinarum ΔVNG0258H
mutant strain grows similarly to the isogenic Δura3
parent strain (Figure 3A). However, when H2O2 is
added, the mutant exhibits a significant growth defect
(Figures 3B, C, D, E, Additional file 6: Figure S1). The
greatest difference in growth rate between the
ΔVNG0258H and Δura3 strains is observed at 6 mM
H2O2 added at inoculation (Figure 3B, p < 7.9 × 10^-7)
and 18.75 mM H2O2 added in mid-logarithmic phase
(Figure 3D, p < 2.4 × 10^-10). These ΔVNG0258H
growth defects are significantly complemented in trans
by a constitutively expressed, plasmid-borne wild type
copy of the VNG0258H gene (Additional file 7: Figure
S2; Methods). Both strains are completely growth-inhibited
when challenged with 7 mM H2O2 at time
of inoculation (Figures 3B and 3C) or 25 mM H2O2
added in mid-logarithmic phase (Figures 3D, 3E, and
Additional file 3: Table S1), suggesting a relationship
Figure 2 Homology of VNG0258H winged helix-turn-helix (wHTH) putative transcription factor with haloarchaeal homologs and
bacterial matches to wHTH domain. Residues in bold blue font depict those known to interact with the major groove of OhrR in B. subtilis,
whereas those in bold red letters represent residues of the wing that contact the minor groove [52]. In the N- or C-terminal domains (white
overbar), no homology was detected outside the halophilic archaea. Perfectly conserved residues are shaded black, whereas conservatively
substituted residues are shaded grey. Black overbars designate characterized helix-turn-helix (HTH) and wing regions from bacterial MarR family
members. Hvo_0730, Haloferax volcanii (GenBank genome accession NC002945); Hwa, Haloquadratum walsbyi (NC_008212); Htu, Haloterrigena
turkmenica (NC_013743); Nmag, Natrialba magadii (NC_013922); Huta, Halorhabdus utahensis (NC_013158); HacjB3, Halalkalicoccus jeotgali B3
(NC_014297); Hsal, Halobacterium salinarum NRC-1 (NC_002607); Hma, Haloarcula marismortui (NC_006396); Bsu, Bacillus subtilis; Eco, E. coli.
Numbers following each species name refer to gene unique identifiers in each genome. OE1405R in H. salinarum is a cross-reference to the
corresponding gene in the R1 strain [41].



between cell density or growth phase and H2O2 resist- 
ance. Together, these phenotypic data suggest that (a) 
the VNG0258H protein is important for protection 
against oxidative stress caused by exposure to high 
levels of exogenous H2O2; and (b) cell density and 
H2O2 resistance tend to co-vary.



ΔVNG0258H is also markedly more susceptible to PQ
stress than the parent strain. PQ added at the time of
inoculation slows the growth rate of both strains, though
ΔVNG0258H growth decreases more dramatically
(Figure 4A; p < 6.0 × 10^-13). Both strains grow normally
in the presence of PQ up to about 12 hours, at which
Figure 3 ΔVNG0258H is impaired for growth and survival upon exposure to hydrogen peroxide (H2O2). (A) Comparison of ΔVNG0258H and
Δura3 parent growth rates under standard conditions across all experiments (n = 63, see also Additional file 3: Table S1). (B) Mean growth rates for 4
biological replicate cultures treated with H2O2 in lag phase. Blue bars represent Δura3 cultures; red bars represent ΔVNG0258H cultures. Concentration of
H2O2 added is indicated on the x-axis. (C) Representative growth curves (1 of 4 biological replicates) for cultures treated with H2O2 at beginning of
growth (OD600 ≈ 0.05). Line colors are as in (B). Thin, medium, and thick lines indicate H2O2 added to a final concentration of 0, 5, or 6 mM, respectively
(see legend; 7 mM curves omitted for clarity). Downward arrow indicates time of H2O2 addition. Bracket indicates period for which mean growth rates
were calculated. (D) Mean growth rates for 7 biological replicate cultures treated with H2O2 in mid-logarithmic growth phase. (E) Representative growth
curves for cultures treated with H2O2 in mid-logarithmic phase (OD600 ≈ 0.3). Thin, medium, and thick lines indicate H2O2 added to a final concentration
of 0, 18.75, or 25 mM, respectively (curves for 6.25 and 12.5 mM conditions are omitted for clarity). Downward arrow indicates time of treatment. Bracket
indicates period for which mean growth rates were calculated. In all bar graphs, error bars represent standard deviation. Asterisks represent statistically
significant differences between ΔVNG0258H and parent strain Δura3 under the same growth conditions, where single asterisk indicates a p-value < 0.01,
double asterisk indicates p < 0.001, and triple asterisk indicates p < 0.0001. All raw data are given in Additional file 3: Table S1.



point growth rate slows significantly (Figure 4B). When 
PQ is added at mid-logarithmic phase, however, growth 
declines immediately after the PQ addition (Figure 4D). 
ΔVNG0258H is significantly more susceptible to PQ
addition in mid-logarithmic phase than Δura3, with
complete inhibition of growth observed at 0.333 mM
PQ (Figure 4C; p < 1.6 × 10^-8). In contrast to the H2O2
response, PQ addition in lag and mid-logarithmic growth
phases caused similar effects on growth, suggesting no 
relationship between cell density and susceptibility to 
PQ (e.g. compare Figure 4A to 4C). Together these PQ 
phenotypic data suggest that (a) VNG0258H is required 
for resistance to PQ exposure; and (b) PQ resistance of 
H. salinarum is independent of cell density. 



VNG0258H is required for appropriate gene expression 
dynamics in response to ROS induced by H2O2 and PQ

To determine whether VNG0258H plays a role in gene 
regulation, mRNA expression in the ΔVNG0258H deletion
mutant and Δura3 parent backgrounds was monitored
using microarrays, 40 and 20 minutes prior to
H2O2 and PQ treatment and at 10, 20, 40, 60, 80 minutes
following H2O2 or PQ treatment (Additional file 8: Figure
S3 and Additional file 9: Figure S4, respectively). Three 
additional time points at 2 h, 8 h, and 24 h were moni- 
tored for PQ. Expression was measured using microar- 
rays spotted with probes against each of the H. 
salinarum NRC-1 open reading frames (ORFs; [43]; 
Methods). 
Figure 4 ΔVNG0258H is impaired for growth and survival upon exposure to paraquat (PQ). (A) Mean growth rates for 7 biological
replicate cultures treated in lag phase. Blue bars represent Δura3 cultures; red bars represent ΔVNG0258H cultures. (B) Representative growth
curves (1 of 7 biological replicates) for cultures treated with PQ at beginning of growth phase (OD600 ≈ 0.05). Thin, medium, and thick lines
indicate PQ added to a final concentration of 0, 0.167, or 0.333 mM, respectively (curve for 0.083 mM omitted for clarity). Downward arrow
indicates time of PQ addition. Bracket indicates period for which mean growth rates were calculated. Line colors are as in (A). (C) Mean growth
rates for 7 biological replicate cultures treated with PQ in mid-logarithmic growth phase. (D) Representative growth curves (1 of 7 biological
replicates) for cultures treated with PQ in mid-logarithmic phase (OD600 ≈ 0.3). Line widths indicate the same PQ concentrations as in (B).
Representative curves for 0.083 mM condition are omitted for clarity. Asterisks and error bars are as in Figure 3. All raw data are given in
Additional file 3: Table S1.



Clusters of gene expression patterns in response to H2O2

As expected from previous studies [11], a substantial pro- 
portion of the genome (626 of 2,410 genes, 26%) exhibited 
changes in mRNA abundance in response to H 2 C>2 treat- 
ment in the Aum3 parent strain (Figure 5, Additional file 4: 
Table S2). Of these 626, 332 genes changed in abundance 
in response to H2O2 in the parent strain but were un- 
affected by the VNG0258H mutation (Figure 5H, J). These 
genes are considered "VNG0258H-independent". 294 of 
the 626 genes exhibited significant changes in the 
ΔVNG0258H mutant compared to the parent during H2O2
exposure (Figure 5, Additional file 8: Figure S3). According
to significance analysis of microarrays (SAM), these genes
fell into four distinct patterns, or clusters, of VNG0258H-dependent
induction or repression. The first cluster
includes 63 genes which exhibit a change in mRNA abundance
in ΔVNG0258H relative to Δura3 regardless of H2O2
treatment (Figure 5A and B). The second cluster includes
191 genes which require VNG0258H for appropriate
expression in the presence of H2O2 (Figure 5C and D). In
this cluster, we detected time-resolved waves of
VNG0258H-dependent activation of genes in response to
H2O2 (Figure 5D, Methods), with 43 genes activated within
10 minutes of H2O2 exposure ("early" genes), and 86 more
within 40 minutes ("late" genes; Figure 5D). In contrast, 62
genes requiring VNG0258H for repression in response to
H2O2 form a single, coherent cluster, with no waves
detected (Figure 5C). The third cluster includes 27 genes
which show increased expression in ΔVNG0258H in the
absence of H2O2 (Figure 5E). Finally, the fourth cluster
includes 13 genes which exhibit altered dynamics in
ΔVNG0258H (Figure 5F and G). These genes exhibited an
impulse-like wave of expression in the parent strain. Although
the expression patterns of these genes were equivalent
in ΔVNG0258H and the parent for the first 40 minutes
following H2O2 exposure, expression levels in the mutant
remained elevated compared to the parent level for the
remainder of the time course (Figure 5F). The converse pattern was also
detected (Figure 5G). Across all four clusters combined,
approximately equal proportions of the 294 VNG0258H-dependent
genes are under-expressed (48%; Figure 5B, D,
E, G) as are over-expressed (52%; Figure 5A, C, F) in
ΔVNG0258H. Together these data suggest that VNG0258H
(a) is bifunctional, required for the activation of some genes
and the repression of others in response to H2O2 (Figure 5);
and (b) may be involved in fine-tuning of gene expression
dynamics for a subset of genes.
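
As a bookkeeping aid, the cluster sizes reported above can be tallied to confirm that they account for all 294 VNG0258H-dependent genes. The short sketch below uses only the counts stated in the text; the dictionary labels are our own shorthand and are not names used in the analysis.

```python
# Cluster sizes as reported in the text for the H2O2 time course.
clusters = {
    "cluster 1 (regardless of H2O2 treatment)": 63,
    "cluster 2 (presence of H2O2)": 191,
    "cluster 3 (absence of H2O2)": 27,
    "cluster 4 (altered impulse-like dynamics)": 13,
}

# Breakdown of cluster 2 as reported in the text.
cluster2_parts = {"early activated": 43, "late activated": 86, "repressed": 62}


def total(counts):
    """Sum the gene counts in a label -> count mapping."""
    return sum(counts.values())


total_dependent = total(clusters)
cluster2_total = total(cluster2_parts)
print(total_dependent, cluster2_total)
```

The two totals (all clusters, and the parts of cluster 2) can then be compared against the 294 and 191 genes quoted in the text.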

Clusters of gene expression patterns in response to PQ 

The mRNA levels for 188 genes changed in abundance in
response to PQ addition to mid-logarithmic phase cultures
but exhibited similar dynamic patterns in the parent
strain and ΔVNG0258H. This indicates that these genes
do not rely on VNG0258H for their response to PQ
("VNG0258H-independent"; Additional file 5: Table S3,
Figure 6D, E). In contrast, 61 genes were VNG0258H-dependent
(Figure 6A, B, C). Of these 61, 7 genes were
upregulated in ΔVNG0258H but unaffected by PQ in the
Δura3 parent strain (Figure 6A). 30 genes were up-regulated
dynamically in the Δura3 parent in response to
PQ but were constitutively up-regulated in ΔVNG0258H,
suggesting that VNG0258H is required to repress these
genes during standard growth conditions and that this repression
is relieved in response to PQ (Figure 6B). The
remaining 24 genes were down-regulated in response to
PQ in the parent but remained low through the duration
of the experiment in the ΔVNG0258H strain (Figure 6B
and C).

Figure 5 Gene expression in response to H2O2 exposure in the Δura3 parent vs ΔVNG0258H mutant strains. Each line in each graph
represents the mean expression profile of gene clusters that rely on VNG0258H for their appropriate expression (A-G) or those that respond to
25 mM H2O2 treatment regardless of strain background (H and J). Time points before and after H2O2 exposure in the Δura3 parent strain (black)
or ΔVNG0258H mutant (red) are divided by the dotted line. (A) Genes requiring VNG0258H for repression regardless of condition. (B) Genes
requiring VNG0258H for activation regardless of condition. (C) Genes requiring VNG0258H for repression in the presence of H2O2. (D) Genes
requiring VNG0258H for activation in the presence of H2O2. Dotted traces represent late waves of gene expression. (E) Genes requiring VNG0258H
for repression in the absence of H2O2. (F) Genes requiring VNG0258H for impulse-like dynamic induction. (G) Genes requiring VNG0258H for
impulse-like dynamic repression. (H) Genes induced in response to H2O2 but independent of VNG0258H (note the difference in y-axis scale
between F and H). (J) Genes repressed in response to H2O2 but independent of VNG0258H. Gene expression profiles for individual genes in each
cluster are shown in heat maps in Additional file 8: Figure S3. Detailed annotations for genes in each cluster are listed in Additional file 4: Table S2.

Figure 6 Gene expression in response to PQ exposure in the Δura3 parent vs. ΔVNG0258H mutant strains. Line graphs represent mean
expression profiles for each cluster of genes. Colors are as in Figure 5. The dotted line on each graph represents the time of PQ addition to each
culture. (A) Genes requiring VNG0258H for repression regardless of PQ addition. (B) Genes requiring VNG0258H for repression in the absence of
PQ and up-regulation in the presence of PQ. (C) Genes requiring VNG0258H for activation in the absence of PQ. (D) Genes induced in response to
PQ but independent of VNG0258H. (E) Genes repressed in response to PQ but independent of VNG0258H. Gene expression profiles for individual
genes in each cluster are shown in Additional file 9: Figure S4. Annotation details for genes in each cluster are listed in Additional file 5: Table S3.

Upon comparison of the VNG0258H-dependent genes 
from the H2O2 experiment to those from the PQ experi- 
ment, we observed that 32 genes (50 including predicted 
operon members) were members of both lists, suggesting 
that these genes are dependent upon VNG0258H regardless 
of growth condition or stress treatment (Table 1). These 
genes are considered to be the core VNG0258H regulon. 
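
The comparison described above amounts to intersecting two gene lists. The following minimal sketch illustrates the operation; the function name and the gene identifiers are placeholders chosen for illustration and are not the lists used in this study.

```python
def core_regulon(h2o2_dependent, pq_dependent):
    """Return the genes that are regulator-dependent under both stress conditions."""
    return sorted(set(h2o2_dependent) & set(pq_dependent))


# Placeholder identifiers for illustration only; these are not real gene lists.
h2o2_dependent = ["geneA", "geneB", "geneC", "geneD"]
pq_dependent = ["geneB", "geneD", "geneE"]

shared = core_regulon(h2o2_dependent, pq_dependent)
print(shared)
```

Applied to the full H2O2 and PQ lists, the same operation would yield the shared genes summarized in Table 1.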

Functional enrichment in gene expression clusters 

According to archaeal Clusters of Orthologous Groups
(arCOG) categories, the 294 VNG0258H-dependent genes
(Figure 5) were found to be 2-fold enriched for functions
in protein turnover/chaperones (category O) compared to
the 332 VNG0258H-independent genes (p-value < 0.2 vs.
0.96 for VNG0258H-independent genes, Additional file 4:
Table S2). We also observed 2-fold enrichment in translation
(category J) and 1.5-fold in DNA recombination and
repair (category L) (Figure 7A). For example, two gene
products in the cluster dependent upon VNG0258H for
impulse-like dynamics (Figure 5F and G) are predicted to
function in DNA mismatch repair (i.e. mutS1, mutT) and
one as a TF (i.e. VNG0704C; Figure 7A, Additional file 4:
Table S2). The majority of targets were of unknown function
(Figure 7A; p < 0.025).
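
For readers unfamiliar with category enrichment, the sketch below shows one common way such p-values and fold enrichments can be computed, namely an upper-tail hypergeometric test. This is an assumption made for illustration: the statistical test actually used is described in the Methods and may differ. The function names and the toy numbers are ours and are not taken from the data.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """Upper-tail hypergeometric p-value, P(X >= hits).

    population: total number of genes considered
    annotated:  genes in the population that carry the category
    sample:     number of genes in the list of interest
    hits:       genes in the list that carry the category
    """
    max_hits = min(annotated, sample)
    denom = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, max_hits + 1)
    )
    return tail / denom


def category_fold_enrichment(population, annotated, sample, hits):
    """Category fraction in the list divided by the background fraction."""
    return (hits / sample) / (annotated / population)


# Toy numbers for illustration only (not taken from the paper).
p = hypergeom_enrichment_p(population=10, annotated=5, sample=5, hits=5)
fe = category_fold_enrichment(population=10, annotated=5, sample=5, hits=5)
print(p, fe)
```

In this toy case all five sampled genes carry a category held by half of the background, so the fold enrichment is 2 and the p-value is the probability of drawing exactly that set.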

Genes dependent upon VNG0258H for differential ex- 
pression in response to paraquat are mostly of unknown 
function (Figure 7B; p < 2.5 x 10 ), though they are also 
enriched for genes predicted to be involved in translation 
(Figure 7B). Genes that are VNG0258H-dependent in both 
PQ and H2O2 stress conditions have varied functions 
(Table 1), including transcriptional regulation (tfbA and
Lrp-like regulator trh1), superoxide detoxification (sod2),
and amino acid metabolism (e.g. histidine and arginine 
biosynthesis). 
Table 1 RosR regulon with arCOG data (archaeal Clusters of Orthologous Groups)

ORF        Gene alias  arCOG ID    Category  Protein function
VNG0144H   VNG0144H    arCOG02761  S         Uncharacterized conserved protein
VNG0255C   VNG0255C    arCOG02942  L         Ribonuclease HI
VNG0256H   VNG0256H    arCOG04769  S         Uncharacterized conserved protein
VNG0439C   VNG0439C    arCOG00570  C         Dehydrogenase (flavoprotein)
VNG0485H   VNG0485H    arCOG09222  S         Uncharacterized conserved protein
VNG0486G   hat1        arCOG00842  J         Acetyltransferase, RimL family
VNG0487H   VNG0487H    arCOG00842  J         Acetyltransferase, RimL family
VNG0488H   VNG0488H    arCOG04770  I         Acyl-CoA synthetase (AMP-forming)/AMP-acid ligase II
VNG0506H   VNG0506H    arCOG09224  S         Uncharacterized conserved protein
VNG0556G   sgb         arCOG02209  R         Polysaccharide biosynthesis protein, MviN family
VNG0777G   taqD        arCOG01222  M         Cytidylyltransferase fused to conserved domain of DUF357 family
VNG0778C   VNG0778C    arCOG01139  R         Predicted metal-dependent protease of the PAD1/JAB1 superfamily
VNG1041H   VNG1041H                NA        NA
VNG1201G   fucA        arCOG04226  G         Fuculose-1-phosphate aldolase
VNG1202C   VNG1202C    arCOG02291  R         HAD superfamily hydrolase
VNG1204G   gdhA2       arCOG01352  E         Glutamate dehydrogenase/leucine dehydrogenase
VNG1246H   VNG1246H    arCOG04608  S         Uncharacterized conserved protein
VNG1330H   VNG1330H    arCOG07569  S         Uncharacterized conserved protein
VNG1332G   sod2        arCOG04147  P         Superoxide dismutase
VNG1343C   VNG1343C    arCOG04303  R         Uncharacterized Rossmann fold enzyme
VNG1404G   trh1        arCOG02815  K, O      Putative transcription factor, Lrp family (K). Conserved domain frequently associated with peptide methionine sulfoxide reductase I
VNG1425H   VNG1425H    arCOG04789  S         Uncharacterized conserved protein
VNG1444G   hisD        arCOG04352  E         Histidinol dehydrogenase
VNG1474G   est         arCOG01648  R         Alpha/beta superfamily hydrolase
VNG1533H   VNG1533H    arCOG06229  S         Uncharacterized conserved protein
VNG1589C   VNG1589C    arCOG09277  S         Uncharacterized conserved protein
VNG1749G   gbp1        arCOG00357  J         Predicted GTPase, probable translation factor
VNG1948H   VNG1948H    arCOG04525  S         Uncharacterized conserved protein
VNG1963H   VNG1963H                NA        NA
VNG2184G   tfbA        arCOG01981  K         Transcription initiation factor TFIIIB, Brf1 subunit/Transcription initiation factor TFIIB
VNG2286G   mamA        arCOG01710  I         Methylmalonyl-CoA mutase, C-terminal domain/subunit (cobalamin-binding)
VNG2288G   mamB        arCOG06231  E         Glutamate mutase epsilon subunit
VNG2289G   mal         arCOG06232  E         Methylaspartate ammonia-lyase
VNG2290G   maoC1       arCOG00775  I         Acyl dehydratase
VNG2291G   cot         arCOG06124  C         Acetyl-CoA hydrolase
VNG2376H   VNG2376H    arCOG04728  S         Uncharacterized conserved protein
VNG2444C   VNG2444C    arCOG01141  R         Phosphoesterase
VNG2556H   VNG2556H    arCOG09321  S         Uncharacterized conserved protein
VNG2570G   dcd         arCOG04048  F         Deoxycytidine deaminase
VNG2591C   VNG2591C    arCOG02264  S         Predicted membrane protein
VNG2593H   VNG2593H    arCOG03026  O         Thioredoxin-like protein
VNG2594C   VNG2594C    arCOG09323  S         Uncharacterized conserved protein
VNG2669G   cyo         arCOG04471  S         Predicted membrane protein
VNG5143C   VNG5143C    arCOG09333  R         Predicted permease
VNG5157H   VNG5157H                NA        NA
VNG5164C   VNG5164C    arCOG04311  R         Predicted hydrolase of HD superfamily
VNG6275H   VNG6275H                NA        NA
VNG6276H   VNG6276H    arCOG09354  S         Uncharacterized conserved protein
VNG6312G   argS        arCOG00487  J         Arginyl-tRNA synthetase
VNG6313G   nhaC3       arCOG02010  C         Na+/H+ antiporter



ORF - open reading frame number from H. salinarum NRC-1 genome [44]; Gene alias - where present, common four-letter name of protein; arCOG ID, 
identification code for each arCOG sub-group annotation [48]; Category - arCOG category letter designation for each protein (as in Figure 7, Additional file 4: 
Table S2, Additional file 5: Table S3); protein function - function of predicted protein product as annotated by arCOGs; NA, not in arCOGs. 



Surprisingly, we also detected functional enrichment for
lipid transport and metabolism (category I, p < 0.05) and
cell wall biogenesis (category M, p = 0.0515) for genes that
are responsive to H2O2 but independent of VNG0258H
regulation. We did not detect these enrichments in response
to PQ. Together these functional enrichments suggest
that (a) VNG0258H plays an important role in the
regulation of protein production and/or turnover and
DNA repair systems as well as currently uncharacterized
cellular processes; and (b) lipid metabolism and cell wall
biogenesis functions may be important in the specific response
to H2O2 in this organism. In sum, the H2O2 and
PQ gene expression data (Figure 5, Figure 6, Additional
file 8: Figure S3 and Additional file 9: Figure S4, Additional
file 4: Table S2 and Additional file 5: Table S3) suggest
that VNG0258H is a bifunctional regulator of genes
whose products are required for ROS resistance.
VNG0258H may be required for impulse-like dynamics
and time-resolved waves of gene expression in response to
H2O2 but not PQ. We will therefore henceforth refer to
VNG0258H as RosR, or reactive oxygen species regulator.



RosR binds directly to the chromosomal locus encoding 
superoxide dismutase 

To determine if RosR's effects on gene expression are
mediated via direct interaction with DNA, we performed
in vivo binding analysis using chromatin immunoprecipitation
(ChIP [29]) coupled to quantitative PCR (qPCR [27]).
We detected direct RosR-DNA binding to the sod2 locus,
whose product functions as a manganese-binding superoxide
dismutase in H. salinarum [11]. The sod2 transcript
is also significantly activated in ΔrosR regardless of which
oxidant is added (Figure 8B, 8C, Table 1). Under standard
conditions (mid-log phase, rich medium, 37°C), the sod2
locus is 2.5-fold enriched for binding to RosR over the controls,
which included mock input and TrmB transcription
factor (previously shown not to bind the sod2 locus, [20];
Figure 8A). These results demonstrate that RosR binds to
DNA under standard growth conditions. Combined with
the gene expression microarray experiments (Figures 5 and
6), these data suggest that RosR-DNA binding is associated
with repression of sod2 transcriptional activity.
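
To make the notion of a fold enrichment over controls concrete, the sketch below shows one generic way a ChIP-qPCR enrichment ratio can be derived from threshold cycle (Ct) values. It is an illustration under stated assumptions rather than the procedure used here: it assumes perfect doubling per PCR cycle, the function names are ours, and the Ct values are hypothetical numbers chosen to make the arithmetic easy to follow.

```python
def ip_over_wce(ct_ip, ct_wce):
    """Relative abundance of a region in IP vs. whole cell extract (WCE).

    Assumes product doubles every cycle, so one cycle earlier means 2-fold more template.
    """
    return 2.0 ** (ct_wce - ct_ip)


def chip_fold_enrichment(target, control):
    """IP/WCE ratio in the TF sample divided by the same ratio in a control sample.

    Each argument is a (ct_ip, ct_wce) tuple measured at the same locus.
    """
    return ip_over_wce(*target) / ip_over_wce(*control)


# Hypothetical Ct values, for illustration only.
tf_sample = (24.0, 26.0)    # IP crosses threshold 2 cycles before WCE: ratio 4
mock_sample = (26.0, 26.0)  # IP and WCE cross threshold together: ratio 1
enrichment = chip_fold_enrichment(tf_sample, mock_sample)
print(enrichment)
```

A ratio near 1 would indicate no enrichment relative to the control, whereas values above 1 indicate preferential recovery of the locus in the transcription factor immunoprecipitate.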

Refining the gene regulatory network 

To assess the accuracy of the GRN predictions [11], we
compared the gene expression results described here
(Figures 5 and 6) to the predictions of the GRN (Figure 1).
Predictions from the model suggested that RosR regulates
cobalamin biosynthesis (cbiJ, gene set 91, Figure 1) and
oxidoreductase genes (yajO2, gene set 6, Figure 1), which
our H2O2 gene expression results have confirmed
(Figure 5C, Additional file 4: Table S2). However, the prediction
that RosR regulates genes involved in iron homeostasis
(e.g. siderophore uptake genes iucABC; gene set 12)
was not confirmed here (i.e. the expression of these genes
was not significantly affected by the ΔrosR deletion).

We also explored the GRN for cis-regulatory sequence
predictions. Of the three sets of genes that were predicted to
be RosR-dependent (set 6, 12, and 91, Figure 1), only set 12
contained a cis-regulatory sequence prediction [11]. Therefore,
we conducted de novo motif discovery using the MEME
algorithm (see Methods). We conducted two different computational
searches, including (a) phylogenetic footprinting
[24] with the sod2 promoter sequences from all haloarchaeal
genomes with a predicted RosR homolog (Figure 2), and (b)
searches using promoter sequences of all 50 genes shared
between the PQ and H2O2 datasets (Table 1). Using MEME,
we detected a set of related putative motifs, each of which
has a high likelihood of containing a central palindromic
TCG-N-CGA motif (p < 7 x 10^-56, Additional file 10: Figure
S5) flanked by consensus sequences of varying strength.
Taken together, the gene expression and putative cis-regulatory
sequence results described here confirm and refine
the statistically inferred GRN prediction [11].
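
Once a candidate core motif such as TCG-N-CGA is in hand, promoter sequences can be scanned for exact occurrences with a simple pattern search. The sketch below is an illustration only and is not a substitute for MEME: it assumes the spacer "N" is a single nucleotide by default (the spacer length is exposed as a parameter because this is our assumption), and the example sequence is made up.

```python
import re


def find_palindromic_sites(sequence, spacer=1):
    """Return (start, site) pairs for TCG-N-CGA sites, allowing overlapping matches."""
    # The lookahead lets the scan report sites that overlap one another.
    pattern = re.compile(r"(?=(TCG[ACGT]{%d}CGA))" % spacer)
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


def reverse_complement(sequence):
    """Reverse complement of a DNA sequence written with A, C, G, T."""
    return sequence.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


promoter = "AATCGACGATT"  # made-up sequence for illustration
sites = find_palindromic_sites(promoter)
print(sites)
```

Because the flanking half-sites TCG and CGA are reverse complements of each other, a site found on one strand is also found at the corresponding position on the opposite strand, which is what is meant by a palindromic motif.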

Discussion 

Here we have used a systems biology approach to identify
and characterize a novel transcription factor, RosR.
This protein is required for survival in the face of extremely
high levels of ROS exposure (Figures 3 and 4), as
it activates and represses genes encoding macromolecule
repair and cellular maintenance functions (Figures 5, 6, 7).
It directly binds the promoter of sod2 (Figure 8). Although
future studies are necessary to differentiate whether the
remaining genes are direct or indirect targets of RosR,
these results support the conclusion that RosR may be a
bifunctional transcription factor that regulates the extreme
ROS response of H. salinarum.

Figure 7 Genes dependent on RosR and responsive to ROS (H2O2 and PQ) are enriched for functions in protein and DNA repair. (A)
Predicted functions of genes differentially expressed in response to H2O2 according to archaeal Clusters of Orthologous Groups (arCOG)
categories [48]. Category annotations are listed on the Y-axis. Black bars represent the number of genes in each category dependent upon
RosR for their differential expression (Figure 5A-G), whereas grey bars enumerate genes in each category that are differentially expressed in
response to H2O2 but not affected by the ΔrosR mutation (i.e. "RosR-independent", Figure 5H and J). (B) Predicted functions of genes differentially
expressed in response to paraquat (PQ) according to arCOG. Colors and category annotations are as in (A). Asterisks denote significant
overrepresentation of a functional category with p < 0.05. Plus signs (+) indicate enrichment p < 0.2. Details of gene annotations, membership in
each arCOG category, and p-values for enrichment for all categories are listed in Additional file 4: Table S2 for H2O2 data and Additional file 5:
Table S3 for PQ data.

Figure 8 ChIP-qPCR data suggest that RosR binds DNA directly. (A) Relative enrichment ratio is shown for a 100-bp region in the sod2 promoter
region in immunoprecipitates (IP) compared to randomly sheared chromosomal DNA (whole cell extract, WCE). Enrichments are compared for
the putative VNG0258H transcription factor, empty plasmid ("mock") and TrmB (a transcription factor known not to bind to the sod2 locus [20]).
Error bars represent +/- SEM from the mean of 4 biological replicate experiments. (B) sod2 (superoxide dismutase) is overexpressed in ΔrosR vs.
the parent strain in H2O2 gene expression experiments. Red trace represents microarray gene expression data (also shown in Figure 5A) for sod2
in ΔrosR, whereas the black trace is for the Δura3 parent strain. (C) sod2 (superoxide dismutase) is overexpressed in ΔrosR vs the parent strain in
PQ gene expression experiments. Colors are as in (B). X-axes in (B) and (C) show time relative to H2O2 or PQ addition (min), respectively.

H. salinarum grows at remarkably high PQ and H2O2 
stress conditions [9,11] compared to ROS-sensitive 
species such as E. coli [3], which the results reported 
here corroborate (Figures 3, 4). For example, we found 
that more than 0.3 mM PQ is required to decrease H. 
salinarum Aura3 growth rate to half its original rate, 
whereas it takes only about 0.1 mM PQ to achieve a 
comparable decrease in growth rate in E. coli B [53]. 
Similarly, 1 mM H2O2 is lethal to E. coli [54], whereas
H. salinarum survives up to 25 mM H2O2.

Similar to E. coli and other mesophiles, H2O2 resistance
in H. salinarum is proportional to cell density,
whereas PQ resistance is independent of cell number
(Figures 3 and 4, [55]). Previous studies in E. coli suggest
that H2O2 scavenging capacity is higher in dense cultures
due to increased concentration of scavenging
enzymes [56]. In contrast, PQ is a redox cycling drug
that continually produces endogenous ROS in the cell
membrane and so cannot be cleared from the culture
during growth [57]. This difference in chemistry of the
oxidants and the different responses of H. salinarum
observed here suggest that PQ may be a better proxy for
ROS damage resulting from continuous UV exposure in
the salt lake environment.

H. salinarum uses a battery of enzymatic and non-enzymatic
strategies to withstand macromolecular damage
in its highly oxidative, saturated salt habitat, including genetic
redundancy of DNA repair and antioxidant enzyme-coding
genes [4,6,7]. Interestingly, among the functionally
redundant DNA repair genes MutT and MutS, we found
that RosR regulates only one of each of the paralogs (e.g.
mutS1 and not mutS2/3). This suggests that the function of
enzymes encoded by these genes could be only partially redundant.
Alternatively, dynamic regulation of each may
contribute to differential timing of expression and function.
In contrast, both superoxide dismutase genes (sod1 and
sod2) are differentially expressed in ΔrosR in response to
H2O2 (Additional file 4: Table S2). Combined with the
growth data showing that the ΔrosR mutant growth defect is
most dramatic during exposure to high concentrations of
PQ and H2O2 (Figures 3 and 4), our findings suggest that
RosR regulation represents another important component
of the mechanism for ROS protection and repair in this environment
and, by homology, perhaps also in other
haloarchaea.

Our results suggest that RosR may play additional
roles in cellular physiology. A large proportion of the
genes dependent upon RosR for appropriate differential
expression are of unknown function (30% of genes in response
to H2O2 and 45% in response to PQ, Figure 7).
In addition, RosR activates and represses genes that are
independent of oxidant treatment (Figures 5A, B and
6A). In previous studies, many of these genes were also
induced in response to other conditions that damage
macromolecules (e.g. UV and gamma radiation, [2]). In
addition, although no growth defect was observed under
standard aerobic growth conditions (Figures 3 and 4), it
remains formally possible that RosR is involved in regulating
gene expression in response to oxygen shifts in
cooperation with other TFs, especially given the strong
correlation of the rosR gene expression profile with oxygen
shifts (Figure 1D). Future work will investigate the
role of RosR in the response to such conditions.

Figure 9 Comparison of GRN topology from previous studies with the RosR regulon characterized here. (A) Subnetwork diagram
depicting predictions regarding putative VNG0258H function from the ROS environmental gene regulatory inference network (EGRIN) (adapted
from [11]). Circles represent transcription factor (TF) groups. TF group 23 includes VNG0258H and four other putative TFs (VNG0101G, VNG0347G,
VNG1496G, and VNG0890G). Diamonds represent combinatorial logic gates (AND). Blunt arrows represent computationally inferred repression
influences, whereas pointed arrows are activation influences. Squares represent inferred clusters of co-regulated genes and numbers within the
squares refer to EGRIN cluster IDs. (B) Refinements to GRN model based on the work presented in the current study. Node shapes and edge
attributes are as in (A). Solid lines indicate direct regulation (i.e. DNA-protein interaction has been detected), whereas dotted lines represent
interactions that could be direct or indirect (i.e. direct interaction still needs to be tested). Dark grey boxes within light grey boxes indicate that,
out of all genes in the cluster, only the gene indicated in dark grey is RosR-dependent based on the microarray data from the current study.

Other TFs are likely to be involved in the ROS response
in H. salinarum. Our gene expression data suggest
that genes previously implicated in ROS protection
and damage repair in this organism do not require RosR
in response to H2O2 and PQ shock (e.g. non-homologous
end joining and base excision repair pathways,
thioredoxin, and catalase [11]; Figure 5H and I,
Figure 6D and E, Figure 7). Candidate TFs for such regulation
include VNG0101G, VNG0347G, VNG1496G, and
VNG0890G, which are nearest neighbors to RosR in the
existing network (Figure 9) [11]; or the Trh1 and TfbA
transcription factors, whose corresponding genes require
RosR for appropriate expression (Table 1).

Our results refine the previously published GRN
model [11]. According to the model, RosR is combined
within the same regulatory node with four other TFs
(Figure 9A). Together, these genes are predicted to influence
the expression of genes involved in oxidative stress
repair and metal homeostasis, cobalamin biosynthesis,
and redox reactions. Here we have differentiated which
genes included in this prediction are RosR-dependent
and which may be dependent on the four other TFs
(Figure 9B). We have also added cis-regulatory sequence
predictions that were missing from the initial model. Although
the predicted cis-regulatory sequence detected in
sod2 promoters is relatively degenerate (i.e. only 6 nt
long), the conservation of a putative cis-regulatory sequence
in the promoter of sod2 with those from other
haloarchaea is consistent with the idea of an evolutionarily
conserved RosR function (Additional file 10: Figure
S5). In addition, we observed that RosR is required for
impulse-like, time-resolved waves of gene expression in
response to H2O2 (Figure 5). Previous theoretical studies
suggest that such dynamics could result either from
autoregulatory feedback or feed-forward loops comprised
of two TFs [58]. Thus, RosR could regulate itself
or work in concert with other transcription factor(s) (see
candidates above) to bring about impulse-like dynamics.

RosR is highly conserved among haloarchaeal species
but poorly conserved among other archaea and bacteria
(Figure 2). Indeed, four other paralogs of RosR (E-value
< 5 x 10^-19) are present in the genome of H. salinarum
alone [32]. In other archaea, only one other ROS-specific
transcription factor, MsvR in methanogens, has been identified
and characterized [24]. Like RosR, MsvR also
appears to be restricted to a small subset of species [24]
and functions to repress oxidative stress genes, suggesting
interesting evolutionary questions. Sulfonylation [52] or
oxidation of cysteine residues [57,59] is the primary mechanism
for conformational changes of redox-responsive
transcription factors in bacteria and has been hypothesized
for MsvR. These conformational changes influence interactions
with DNA. The RosR protein lacks cysteine residues
and other typical sequences in the effector domains
(Figure 2), so the biochemical mechanism by which RosR
binds DNA and senses oxidants remains unclear.

Conclusions 

We conclude that RosR is a haloarchaeal-specific, wHTH 
transcription factor important for gene regulation in re- 
sponse to highly oxidative conditions. We further suggest 
that RosR is an important node in a large, interconnected 
gene regulatory network (GRN) regulating the response to 
oxidative stress. This study lays groundwork for under- 
standing how the haloarchaea may have evolved to thrive 
in their extremely oxidative, hypersaline niche. 

Additional files 



Additional file 1: Table S4. Lists primers and strains used in this study. 
Additional file 2: Supplementary Methods & Results. 



Additional file 3: Table S1. Includes raw and analyzed cell density data
(as OD600 values) from each growth curve experiment in the Bioscreen C
instrument (main text Figures 3 and 4, Additional file 6: Figure S1 and
Additional file 7: Figure S2). Please see legends for information regarding
each section of the Table.

Additional file 4: Table S2. All gene expression microarray data, 
annotation details, and arCOG memberships for each gene cluster from 
main text Figure 5 (H2O2 exposure) are listed. Please see the tab labeled
"legend" for information regarding each section of the Table. 

Additional file 5: Table S3. Gene expression microarray data and 
arCOG functional annotations for paraquat (PQ) gene expression data. 
Please see the tab labeled "legend" for information regarding each 
section of the Table. 

Additional file 6: Figure S1. Growth in batch culture is similar to that
in the Bioscreen C. (A) Top: comparison of growth yield under standard
conditions (i.e. no stress) in batch vs. Bioscreen C. Δura3 and ΔVNG0258H
maximum cell density (OD600) are shown for the mean of 5 biological
replicate samples with 2 technical replicates each. Error bars represent
standard deviation from the mean. Bottom: comparison of growth rates
under standard conditions in batch culture vs. Bioscreen. Columns and
error bars are as in (A). (C) Representative growth curves for Δura3 parent
strain and ΔVNG0258H mutant strains in response to H2O2 added in mid-logarithmic
phase in batch culture. Addition of H2O2 indicated by arrow.
Cell density (OD600) was measured in a standard spectrophotometer at
the times indicated. Strains and conditions are indicated in the legend.
(D) Representative growth curves in batch culture under paraquat (PQ)
conditions.

Additional file 7: Figure S2. A wild type copy of the VNG0258H gene
supplied on a plasmid (pMTFcmyc vector, [1]) complements the
ΔVNG0258H growth defects. (A) Box-whisker plots depicting growth rates
of H. salinarum strains in the Bioscreen C (Δura3 parent, ΔVNG0258H
mutant, and ΔVNG0258H mutant complemented in trans) during the 12
hours following H2O2 shock (mid-logarithmic phase addition of H2O2).
Horizontal lines within each box represent the median growth rate across
24 replicate trials (8 biological replicates, 3 technical replicates) for each
strain in each condition. Boxes represent the interquartile range (IQR),
and whiskers are minimum and maximum values within 1.5 times the
IQR. Concentrations of H2O2 added are indicated on the X-axis, whereas
the Y-axis quantifies growth rate. (B) Box-whisker plot depicting lag phase
addition of H2O2 to Bioscreen cultures. Boxes, median lines, and whiskers
are as in (A). Y-axis expresses the growth rate of the ΔVNG0258H or trans-complemented
strains as a function of Δura3 growth rate. (C) Box-whisker
plot depicting survival ratios 24 hours after mid-logarithmic
phase addition of 25 mM H2O2 to batch cultures. (D) Box-whisker plot
depicting growth rates following mid-logarithmic phase addition of PQ
to batch cultures. Growth rates are expressed as a function of Δura3
parent strain growth.

Additional file 8: Figure S3. Detailed heat map for each gene cluster
from main text Figure 5 (H2O2 exposure). Data for those genes
dependent on VNG0258H for appropriate expression are shown (i.e. main
text Figures 5A-G). Gene names are listed on the right of each heat map.
Detailed annotations and COG category memberships (main text Figure
7A) for each of these genes are listed in Additional file 4: Table S2. In each
heat map, red represents induction, whereas blue represents repression.
VNG0258H-independent genes (Cluster 4, Figure 5H and J) are not
included in the Figure for brevity and clarity, but expression data and
annotations for these genes are included in Additional file 4: Table S2. (A)
Cluster 1 includes genes that were differentially expressed in the
ΔVNG0258H mutant vs Δura3 parent strain regardless of growth
condition (main text Figures 5A-B). Cluster 1a (main text Figure 5A)
depicts those 33 genes that are over-expressed in the ΔVNG0258H
mutant (i.e. RosR is required to repress these genes). Cluster 1b (main
text Figure 5B) depicts those 30 genes that are under-expressed in the
ΔVNG0258H mutant (i.e. VNG0258H is required to activate these genes).
(B) Cluster 2 includes genes that were differentially expressed in the
ΔVNG0258H mutant vs Δura3 parent strain in the presence of H2O2 (main
text Figures 5C-D). Cluster 2a (main text Figure 5C) contains those 43
genes that are over-expressed in the ΔVNG0258H mutant in response to
H2O2 (i.e. VNG0258H is required to repress these genes in response to
H2O2). Cluster 2b (main text Figure 5D) contains those genes that are
under-expressed in the ΔVNG0258H mutant in response to H2O2 (i.e.
VNG0258H is required to induce them). (C) Cluster 3 includes genes that
were differentially expressed in the ΔVNG0258H mutant vs Δura3 parent
strain in the absence of H2O2 (main text Figure 5E). (D) Growth of Δura3
parent and ΔVNG0258H cultures for gene expression microarray analysis.
Black curves represent growth data for the two biological replicate
cultures of Δura3, whereas red curves are data for the two biological
replicate cultures of ΔVNG0258H. Dotted arrows on the curves indicate
the start and end of sampling over the time courses shown in the heat
maps, whereas the solid arrow shows the time of H2O2 addition to the
cultures.

Additional file 9: Figure S4. Detailed heat map for each gene cluster 
from main text Figure 6. Data for those genes dependent on RosR for 
appropriate expression in response to PQ are shown (main text Figures 
6A-C). Colors and labels are as in Additional file 8: Figure S3. (A) Heatmap 
for Cluster 1, genes differentially expressed in ΔrosR vs the Δura3 parent
strain regardless of growth condition (main text Figure 6A). (B) Heatmap 
for Cluster 2, genes dependent upon RosR for differential expression in 
response to paraquat (PQ). Genes upregulated in the mutant are shown 
on the left (main text Figure 6B) and those downregulated are shown on 
the right (main text Figure 6C). (C) Genes differentially expressed in 
response to PQ that are independent of RosR. Upregulated genes are 
shown (main text Figure 6D). Downregulated genes (171 genes) are not 
shown for brevity, but are listed in Additional file 5: Table S3. (D) Growth 
data for cultures from which RNA was harvested for microarray studies. 
Red arrow indicates the time of PQ addition. 

Additional file 10: Figure S5. Putative cis-regulatory sequences
resulting from MEME analysis on (A) the 50 genes differentially expressed
in common in the PQ and H2O2 gene expression datasets (main text
Table 1), and (B) phylogenetic footprinting using sod2 promoter
sequences from all halophilic archaea possessing a RosR homolog. Each
sequence logo represents a different cis sequence prediction. The height
of the letters in each nucleotide position represents the strength of the
consensus between the input sequences. The putative TCG-N-CGA motif
is boxed in each case. In (A), the top-scoring two motifs from MEME
searches are shown. Top motif p-value is 7.0 x 10^-56, and bottom motif
p-value is 2.6 x 10^-42. 43 of the 50 promoter query sequences contained
each motif. In (B), only the top-scoring motif is shown.